A Synthetic Dataset for Personal Attribute Inference
Yukhymenko, Hanna
Recently, powerful Large Language Models (LLMs) have become easily accessible to hundreds of millions of users worldwide. However, their strong capabilities and vast world knowledge do not come without associated privacy risks. In this work, we focus on an emerging privacy threat posed by LLMs: the ability to accurately infer personal information from online texts. Despite the growing importance of LLM-based author profiling, research in this area has been hampered by a lack of suitable public datasets, largely due to the ethical and privacy concerns associated with real personal data. We take two steps to address this problem: (i) we construct a simulation framework for the popular social media platform Reddit using LLM agents seeded with synthetic personal profiles; (ii) using this framework, we generate SynthPAI, a diverse synthetic dataset of over 7800 comments manually labeled for personal attributes. We validate our dataset with a human study showing that humans barely outperform random guessing on the task of distinguishing our synthetic comments from real ones. Further, we verify that our dataset enables meaningful personal attribute inference research by showing, across 18 state-of-the-art LLMs, that our synthetic comments allow us to draw the same conclusions as real-world data. Combined, our experimental results, dataset, and pipeline form a strong basis for future privacy-preserving research geared towards understanding and mitigating inference-based privacy threats posed by LLMs.
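To make the two-step pipeline concrete, the sketch below illustrates the idea of step (i) and (ii): seeding an LLM agent with a synthetic profile and asking it to comment in character. `chat` and the profile fields are hypothetical placeholders; this is an illustration of the approach, not the released SynthPAI code.

```python
# Hedged sketch (not the released SynthPAI code): seed an LLM agent with a
# synthetic profile (step i) and let it write an in-character comment (step ii).
# `chat(system, messages)` is a hypothetical stand-in for any LLM chat client.
import random

def make_profile(rng: random.Random) -> dict:
    return {
        "age": rng.randint(18, 70),
        "occupation": rng.choice(["nurse", "teacher", "software engineer"]),
        "location": rng.choice(["Berlin", "Toronto", "Melbourne"]),
        "education": rng.choice(["high school", "bachelor's", "master's"]),
    }

def agent_comment(chat, profile: dict, thread: list[str]) -> str:
    persona = ", ".join(f"{k}: {v}" for k, v in profile.items())
    system = (f"You are a Reddit user with this background: {persona}. "
              "Reply to the thread in character; hint at, but never state, "
              "your personal attributes.")
    return chat(system=system, messages=thread)
```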
Exact recovery and Bregman hard clustering of node-attributed Stochastic Block Model
Network clustering tackles the problem of identifying sets of nodes (communities) that have similar connection patterns. However, in many scenarios, nodes also have attributes that are correlated with the clustering structure. Thus, network information (edges) and node information (attributes) can be jointly leveraged to design high-performance clustering algorithms. Under a general model for the network and node attributes, this work establishes an information-theoretic criterion for the exact recovery of community labels and characterizes a phase transition determined by the Chernoff-Hellinger divergence of the model. The criterion shows how network and attribute information can be traded off against each other to achieve exact recovery (e.g., more reliable network information requires less reliable attribute information). This work also presents an iterative clustering algorithm that maximizes the joint likelihood, assuming that the probability distributions of network interactions and node attributes belong to exponential families. This covers a broad range of possible interactions (e.g., edges with weights) and attributes (e.g., non-Gaussian models), as well as sparse networks, while also exploring the connection between exponential families and Bregman divergences. Extensive numerical experiments using synthetic data indicate that the proposed algorithm outperforms classic algorithms that leverage only network or only attribute information, as well as state-of-the-art algorithms that also leverage both sources of information. The contributions of this work provide insights into the fundamental limits of, and practical techniques for, inferring community labels on node-attributed networks.
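As a rough illustration of the clustering ingredient, the sketch below implements generic Bregman hard clustering in the style of Banerjee et al. (2005) on node attributes alone. It is a minimal sketch under that simplification: the paper's joint-likelihood algorithm additionally scores each node's edges under the SBM in the assignment step, which this code omits.

```python
import numpy as np

def bregman_hard_clustering(X, k, divergence, iters=50, seed=0):
    """k-means-style alternation with a pluggable Bregman divergence.
    Attributes only; the paper's algorithm also folds in the SBM edge
    likelihood when assigning nodes to communities."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assignment step: nearest center in Bregman divergence
        labels = np.array([
            min(range(k), key=lambda c: divergence(x, centers[c])) for x in X
        ])
        # update step: for any Bregman divergence the optimal center is the mean
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels

# squared Euclidean divergence recovers classic k-means
labels = bregman_hard_clustering(
    np.random.default_rng(1).normal(size=(200, 2)), k=3,
    divergence=lambda x, m: float(((x - m) ** 2).sum()))
```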
FastSurvival: Hidden Computational Blessings in Training Cox Proportional Hazards Models
Survival analysis is an important research topic with applications in healthcare, business, and manufacturing. One essential tool in this area is the Cox proportional hazards (CPH) model, which is widely used for its interpretability, flexibility, and predictive performance. However, for modern data science challenges such as high dimensionality (in both n and p) and high feature correlations, current algorithms for training the CPH model have drawbacks, preventing us from using the CPH model to its full potential. The root cause is that the current algorithms, based on the Newton method, have trouble converging due to vanishing second-order derivatives outside the local region of the minimizer. To circumvent this problem, we propose new optimization methods that construct and minimize surrogate functions exploiting hidden mathematical structures of the CPH model. Our new methods are easy to implement and ensure monotonic loss decrease and global convergence. Empirically, we verify the computational efficiency of our methods. As a direct application, we show how our optimization methods can be used to solve the cardinality-constrained CPH problem, producing very sparse, high-quality models that were not previously practical to construct. We list several extensions that our breakthrough enables, including optimization opportunities, theoretical questions on the CPH model's mathematical structure, and other CPH-related applications.
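For context, the quantity all of these optimizers minimize is the negative log partial likelihood. The sketch below computes it with the standard suffix-sum trick (ties handling omitted) and hands it to an off-the-shelf quasi-Newton optimizer on toy data; it illustrates the objective itself, not the paper's surrogate-based method.

```python
import numpy as np
from scipy.optimize import minimize

def cph_neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood (ignoring ties, for illustration).
    Sorting by time makes every risk set a suffix of the sample, so the
    log-sum-exp over each risk set is one reversed cumulative sum."""
    order = np.argsort(time)
    X, event = X[order], event[order]
    eta = X @ beta
    m = eta.max()  # stabilize the log-sum-exp
    log_risk = np.log(np.cumsum(np.exp(eta - m)[::-1])[::-1]) + m
    return -np.sum((eta - log_risk)[event == 1])

# toy fit with a generic optimizer, NOT the paper's surrogate method
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
time = rng.exponential(size=100)
event = rng.integers(0, 2, size=100)
fit = minimize(cph_neg_log_partial_likelihood, np.zeros(5),
               args=(X, time, event))
```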
On Divergence Measures for Training GFlowNets
Generative Flow Networks (GFlowNets) are amortized samplers of unnormalized distributions over compositional objects with applications to causal discovery, NLP, and drug design. Recently, it was shown that GFlowNets can be framed as a hierarchical variational inference (HVI) method for discrete distributions. Despite this equivalence, attempts to train GFlowNets using traditional divergence measures as learning objectives were unsuccessful. Instead, current approaches for training these models rely on minimizing the log-squared difference between a proposal (forward policy) and a target (backward policy) distribution. In this work, we first formally extend the relationship between GFlowNets and HVI to distributions on arbitrary measurable topological spaces. Then, we empirically show that the ineffectiveness of divergence-based learning of GFlowNets is due to the large gradient variance of the corresponding stochastic objectives. To address this issue, we devise a collection of provably variance-reducing control variates for gradient estimation based on the REINFORCE leave-one-out estimator. Our experimental results suggest that the resulting algorithms often accelerate training convergence when compared against previous approaches. All in all, our work contributes by narrowing the gap between GFlowNet training and HVI, paving the way for algorithmic advancements inspired by the divergence minimization viewpoint.
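The control-variate construction builds on the standard REINFORCE leave-one-out (RLOO) baseline; a minimal PyTorch sketch of the resulting surrogate loss is shown below. This is the generic score-function estimator, not the GFlowNet-specific divergence objectives of the paper.

```python
import torch

def rloo_surrogate(logp: torch.Tensor, reward: torch.Tensor) -> torch.Tensor:
    """REINFORCE leave-one-out surrogate loss.
    logp:   (K,) log-probabilities of K sampled objects under the policy
            (carries gradients).
    reward: (K,) scalar learning signals, treated as constants.
    Each sample is baselined by the mean reward of the other K-1 samples,
    which reduces gradient variance without biasing the estimator."""
    K = reward.numel()
    baseline = (reward.sum() - reward) / (K - 1)  # leave-one-out mean
    return -((reward - baseline).detach() * logp).mean()
```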
EPIC Fields: Marrying 3D Geometry and Video Understanding (Supplementary Material)
Darkhalil, Ahmad, Fouhey, David
In this supplementary material, we first describe the companion video that provides an overview of our dataset (Section 1), then detail how the data was released (Section 2), and take stock of the additional information promised in the checklist (Section 3). Next, we provide additional details on the dataset construction (Section 4) and on the benchmarks (Section 5). We devote a final section (Section 6) to showing that the EPIC Fields pipeline can be applied to reconstructing videos from the Ego4D dataset. We provide a short video in the form of a trailer at https://youtu.be/RcacE26eObE. It allows viewers to visually assess how challenging the reconstruction problem is and hints at how frame filtering helps. The video also illustrates how the new camera poses complement the existing semantic annotations for this dataset (hands and active objects), showcasing the potential of marrying 3D geometry and video understanding.
Clarifying Misconceptions in COVID-19 Vaccine Sentiment and Stance Analysis and Their Implications for Vaccine Hesitancy Mitigation: A Systematic Review
Barberia, Lorena G, Lombard, Belinda, Roman, Norton Trevisan, Sousa, Tatiane C. M.
Background: Advances in machine learning (ML) models have increased researchers' capability to detect vaccine hesitancy on social media using Natural Language Processing (NLP). A considerable volume of research has identified the persistence of COVID-19 vaccine hesitancy in discourse shared on various social media platforms. Methods: Our objective in this study was to conduct a systematic review of research employing sentiment analysis or stance detection to study discourse towards COVID-19 vaccines and vaccination shared on Twitter (officially known as X since 2023). Following registration in the PROSPERO international registry of systematic reviews, we searched for papers published between 1 January 2020 and 31 December 2023 that used supervised machine learning to assess COVID-19 vaccine hesitancy through stance detection or sentiment analysis on Twitter. We categorized the studies according to a taxonomy of five dimensions: tweet sample selection approach, self-reported study type, classification typology, annotation codebook definitions, and interpretation of results. We analyzed whether studies using stance detection report different hesitancy trends than those using sentiment analysis by examining how COVID-19 vaccine hesitancy is measured and whether efforts were made to avoid measurement bias. Results: Our review found that measurement bias is widely prevalent in studies employing supervised machine learning to analyze sentiment and stance toward COVID-19 vaccines and vaccination. The reporting errors are sufficiently serious that they hinder the generalisability of these studies and their interpretation as evidence of whether individual opinions communicate reluctance to vaccinate against SARS-CoV-2. Conclusion: Improving the reporting of NLP methods is crucial to addressing knowledge gaps in vaccine hesitancy discourse.
LLMs Love Python: A Study of LLMs' Bias for Programming Languages and Libraries
Twist, Lukas, Zhang, Jie M., Harman, Mark, Syme, Don, Noppen, Joost, Nauck, Detlef
Programming language and library choices are crucial to software reliability and security. Poor or inconsistent choices can lead to increased technical debt, security vulnerabilities, and even catastrophic failures in safety-critical systems. As Large Language Models (LLMs) play an increasing role in code generation, it is essential to understand how they make these decisions. However, little is known about their preferences when selecting programming languages and libraries for different coding tasks. To fill this gap, this study provides the first in-depth investigation into LLM preferences for programming languages and libraries used when generating code. We assess the preferences of eight diverse LLMs by prompting them to complete various coding tasks, including widely studied benchmarks and the more practical task of generating the initial structural code for new projects (a crucial step that often determines a project's language or library choices). Our findings reveal that LLMs heavily favour Python when solving language-agnostic problems, using it in 90%-97% of cases for benchmark tasks. Even when generating initial project code for which Python is not a suitable language, it remains the most-used language in 58% of instances. Moreover, LLMs contradict their own language recommendations in 83% of project initialisation tasks, raising concerns about their reliability in guiding language selection. Similar biases toward well-established libraries further create serious discoverability challenges for newer open-source projects. These results highlight the need to improve LLMs' adaptability to diverse programming contexts and to develop mechanisms for mitigating programming language and library bias.
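A study of this kind reduces to repeatedly prompting a model and tallying the language of what comes back; the sketch below shows that measurement loop in miniature. `query_llm` and the regex heuristics are hypothetical placeholders, not the paper's actual detection pipeline.

```python
# Hedged sketch of the measurement idea: prompt an LLM with language-agnostic
# tasks and tally which language it picks. `query_llm(prompt) -> str` is a
# hypothetical stand-in; the crude regexes below are for illustration only.
import re
from collections import Counter

def detect_language(code: str) -> str:
    if re.search(r"^\s*(def |import |from \w+ import)", code, re.M):
        return "Python"
    if re.search(r"#include\s*<", code):
        return "C/C++"
    if re.search(r"\b(function|const|let)\b|=>", code):
        return "JavaScript/TypeScript"
    return "other"

def language_shares(tasks, query_llm, trials=10):
    counts = Counter(
        detect_language(query_llm(f"Write a program that {task}"))
        for task in tasks for _ in range(trials))
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}
```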
Scalable Evaluation of Online Moderation Strategies via Synthetic Simulations
Tsirmpas, Dimitris, Androutsopoulos, Ion, Pavlopoulos, John
Despite the ever-growing importance of online moderation, there has been no large-scale study evaluating the effectiveness of alternative moderation strategies. This is largely due to the lack of appropriate datasets and the difficulty of getting human discussants, moderators, and evaluators involved in multiple experiments. In this paper, we propose a methodology for leveraging synthetic experiments performed exclusively by Large Language Models (LLMs) to initially bypass the need for human participation in experiments involving online moderation. We evaluate six LLM moderation configurations: two real-life moderation strategies currently in use (guidelines issued to human moderators for online moderation, and real-life facilitation), two baseline strategies (guidelines elicited for LLM alignment work, and LLM moderation with minimal prompting), a baseline with no moderator at all, and our own proposed strategy, inspired by a Reinforcement Learning (RL) formulation of the problem. We find that our moderation strategy significantly outperforms established moderation guidelines as well as out-of-the-box LLM moderation. We also find that smaller LLMs, with less intensive instruction-tuning, can create more varied discussions than larger models. To run these experiments, we create and release an efficient, purpose-built, open-source Python framework, dubbed "SynDisco", to easily simulate hundreds of discussions using LLM user-agents and moderators. Additionally, we release the Virtual Moderation Dataset (VMD), a large dataset of LLM-generated and LLM-annotated discussions produced by three families of open-source LLMs, accompanied by an exploratory analysis of the dataset.
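The core of such a synthetic experiment is a round-robin loop in which persona-prompted user-agents take turns and a moderator agent may intervene; the sketch below shows that structure generically. It is an illustration of the setup under assumed interfaces, not SynDisco's actual API.

```python
# Generic round-robin discussion loop with an optional moderator agent.
# `generate(system_prompt, history) -> str` is a hypothetical LLM client;
# this illustrates the experimental setup, NOT SynDisco's actual API.
def simulate_discussion(user_personas, moderator_persona, generate,
                        topic, rounds=5):
    history = [f"Topic: {topic}"]
    for _ in range(rounds):
        for persona in user_personas:
            history.append(generate(persona, history))
        intervention = generate(moderator_persona, history)
        if intervention.strip():  # the moderator may choose to stay silent
            history.append(f"[moderator] {intervention}")
    return history
```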
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation
Waldmann, Leonard, Shah, Ando, Wang, Yi, Lehmann, Nils, Stewart, Adam J., Xiong, Zhitong, Zhu, Xiao Xiang, Bauer, Stefan, Chuang, John
Earth observation (EO) data features diverse sensing platforms with varying spectral bands, spatial resolutions, and sensing modalities. While most prior work has constrained inputs to fixed sensors, a new class of any-sensor foundation models able to process arbitrary sensors has recently emerged. Contributing to this line of work, we propose Panopticon, an any-sensor foundation model built on the DINOv2 framework. We extend DINOv2 by (1) treating images of the same geolocation across sensors as natural augmentations, (2) subsampling channels to diversify spectral input, and (3) adding cross-attention over channels as a flexible patch embedding mechanism. By encoding the wavelengths and modes of optical and synthetic aperture radar sensors, respectively, Panopticon can effectively process any combination of arbitrary channels. In extensive evaluations, we achieve state-of-the-art performance on GEO-Bench, especially on the widely used Sentinel-1 and Sentinel-2 sensors, while outperforming other any-sensor models as well as domain-adapted fixed-sensor models on unique sensor configurations. Panopticon enables immediate generalization to both existing and future satellite platforms, advancing sensor-agnostic EO.
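Mechanism (3) can be pictured as a single learned query cross-attending over per-channel patch tokens keyed by their wavelength encodings, so any number and ordering of channels collapses into one fixed-size patch embedding. The PyTorch sketch below shows this idea under assumed sizes; it is a schematic reading of the mechanism, not the released Panopticon code.

```python
import torch
import torch.nn as nn

class ChannelCrossAttnPatchEmbed(nn.Module):
    """Sketch: one learned query cross-attends over per-channel patch tokens
    (each tagged with a wavelength encoding), so an arbitrary set of spectral
    channels maps to a single fixed-size patch embedding."""
    def __init__(self, patch: int = 16, dim: int = 256, heads: int = 8):
        super().__init__()
        self.proj = nn.Linear(patch * patch, dim)  # per-channel patch -> token
        self.wave = nn.Linear(1, dim)              # wavelength -> additive code
        self.query = nn.Parameter(torch.zeros(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x: torch.Tensor, wavelengths: torch.Tensor) -> torch.Tensor:
        # x: (B, C, patch, patch) one spatial patch with C arbitrary channels
        # wavelengths: (C,) central wavelength of each channel (float)
        B, C = x.shape[:2]
        tokens = self.proj(x.flatten(2))                        # (B, C, dim)
        tokens = tokens + self.wave(wavelengths.view(1, C, 1))  # broadcast over B
        out, _ = self.attn(self.query.expand(B, -1, -1), tokens, tokens)
        return out.squeeze(1)                                   # (B, dim)
```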
Navigating Intelligence: A Survey of Google OR-Tools and Machine Learning for Global Path Planning in Autonomous Vehicles
Benoit, Alexandre, Asef, Pedram
We offer a new in-depth investigation of global path planning (GPP) for an unmanned ground vehicle: an autonomous mining sampling robot named ROMIE. GPP is essential for ROMIE's optimal performance, and it translates into solving the traveling salesman problem (TSP), a complex graph-theory challenge that is crucial for determining the most effective route to cover all sampling locations in a mining field. This problem is central to enhancing ROMIE's operational efficiency and competitiveness against human labor by optimizing cost and time. The primary aim of this research is to advance GPP by developing, evaluating, and improving a cost-efficient software and web application. We delve into an extensive comparison and analysis of Google Operations Research (OR)-Tools optimization algorithms. Our study is driven by the goal of applying and testing the limits of OR-Tools' capabilities by integrating Reinforcement Learning techniques for the first time. This enables us to compare these methods with OR-Tools, assessing their computational effectiveness and real-world application efficiency. Our analysis seeks to provide insights into the effectiveness and practical application of each technique. Our findings indicate that Q-Learning stands out as the optimal strategy, demonstrating superior efficiency by deviating only 1.2% on average from the optimal solutions across our datasets.
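For readers unfamiliar with OR-Tools, the baseline against which the learning-based methods are compared can be set up in a few lines with the routing library's standard TSP recipe, sketched below with a toy distance matrix standing in for the sampling-site distances.

```python
# Standard OR-Tools TSP setup (per the library's documented routing recipe),
# shown with a toy integer distance matrix in place of real field distances.
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

dist = [[0, 2, 9, 10],
        [1, 0, 6, 4],
        [15, 7, 0, 8],
        [6, 3, 12, 0]]

manager = pywrapcp.RoutingIndexManager(len(dist), 1, 0)  # 1 vehicle, depot 0
routing = pywrapcp.RoutingModel(manager)

def distance_callback(i, j):
    return dist[manager.IndexToNode(i)][manager.IndexToNode(j)]

transit = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = (
    routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)

solution = routing.SolveWithParameters(params)
index, route = routing.Start(0), []
while not routing.IsEnd(index):
    route.append(manager.IndexToNode(index))
    index = solution.Value(routing.NextVar(index))
print("route:", route, "cost:", solution.ObjectiveValue())
```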